# Multimodal reasoning enhancement
InternVL3-38B-Instruct
Apache-2.0
InternVL3-38B-Instruct is a multimodal large language model (MLLM) with strong multimodal perception and reasoning capabilities. It supports tasks such as tool use, GUI agents, industrial image analysis, and 3D visual perception.
Text-to-Image
Transformers Other

OpenGVLab
468
3
VisualPRM-8B-v1.1
MIT
VisualPRM-8B-v1.1 is a multimodal process reward model with 8 billion parameters. It improves the reasoning of multimodal large language models by scoring candidate responses under a Best-of-N evaluation strategy.
Multimodal Fusion
Transformers

OpenGVLab
249
6
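
The Best-of-N strategy mentioned in the VisualPRM-8B-v1.1 description can be illustrated with a minimal sketch. It assumes a process reward model is exposed as a function that scores each reasoning step given the steps so far, and that a response's score is the mean of its step scores. Both the `step_scorer` callable and the averaging rule are assumptions made for illustration, not the model's documented API.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical type: takes the reasoning steps so far (the last one is the
# step being judged) and returns a correctness score in [0, 1].
StepScorer = Callable[[Sequence[str]], float]


def best_of_n(candidates: Sequence[Sequence[str]], step_scorer: StepScorer) -> int:
    """Return the index of the candidate whose steps score highest on average.

    Each candidate is a non-empty list of reasoning steps. Averaging the
    per-step scores is an assumed aggregation rule.
    """
    def response_score(steps: Sequence[str]) -> float:
        # Score every prefix so the scorer sees each step in its context.
        return mean(step_scorer(steps[: i + 1]) for i in range(len(steps)))

    return max(range(len(candidates)), key=lambda i: response_score(candidates[i]))


def toy_scorer(prefix: Sequence[str]) -> float:
    """Stand-in for a real reward model: penalize steps marked as wrong."""
    return 0.0 if "wrong" in prefix[-1] else 1.0


candidates = [
    ["read the chart", "wrong: add the two bars"],
    ["read the chart", "subtract the two bars"],
]
best_index = best_of_n(candidates, toy_scorer)
```

In practice the toy scorer would be replaced by calls to the reward model, and the N candidates would come from sampling the policy model several times on the same image and question.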
Featured Recommended AI Models